Optimization of dynamic regimes in a statistical hidden dynamic model for conversational speech recognition

Authors

  • Jeff Z. Ma
  • Li Deng
Abstract

This paper reports our ongoing work aiming to improve the performance of a novel speech recognizer based on an underlying statistical hidden dynamic model of phonetic reduction in the production of conversational speech. We have developed a path-stack search algorithm which efficiently computes the likelihood of any observation utterance while optimizing the dynamic regimes in the speech model. The effectiveness of the algorithm is tested in simulation experiments. It is also tested on Switchboard data, where the dynamic regimes optimized by the search algorithm are compared with those from exhaustive search. Finally, we show speech recognition results on Switchboard data that demonstrate improvements in the recognizer's performance compared with the use of dynamic regimes set heuristically from the phone segmentation produced by a state-of-the-art HMM system.

Similar resources

A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamic model of speech

In this paper we report our recent research whose goal is to improve the performance of a novel speech recognizer based on an underlying statistical hidden dynamic model of phonetic reduction in the production of conversational speech. We have developed a path-stack search algorithm which efficiently computes the likelihood of any observation utterance while optimizing the dynamic regimes in th...

A path-stack algorithm for optimizing dynamic regimes

In this paper we report our recent research whose goal is to improve the performance of a novel speech recognizer based on an underlying statistical hidden dynamic model of phonetic reduction in the production of conversational speech. We have developed a path-stack search algorithm which efficiently computes the likelihood of any observation utterance while optimizing the dynamic regimes in th...

Static and Dynamic Modelling for the Recognition of Non-verbal Vocalisations in Conversational Speech

Non-verbal vocalisations such as laughter, breathing, hesitation, and consent play an important role in the recognition and understanding of human conversational speech and spontaneous affect. In this contribution we discuss two different strategies for robust discrimination of such events: dynamic modelling by a broad selection of diverse acoustic Low-Level-Descriptors vs. static modelling by ...

A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition

An overview of a statistical paradigm for speech recognition is given where phonetic and phonological knowledge sources, drawn from the current understanding of the global characteristics of human speech communication, are seamlessly integrated into the structure of a stochastic model of speech. A consistent statistical formalism is presented in which the submodels for the discrete, feature-bas...

Efficient decoding strategy for conversational speech recognition using state-space models for vocal-tract-resonance dynamics

In this paper, we present an efficient strategy for likelihood computation and decoding in a continuous speech recognizer using underlying state-space dynamic models for the hidden speech dynamics. The state-space models have been constructed in a special way so as to be suitable for the conversational or casual style of speech where phonetic reduction abounds. The interacting multiple model (I...

Publication date: 1999